Skip to content

Conversation

@srikanthpadakanti
Copy link
Contributor

@srikanthpadakanti srikanthpadakanti commented Jan 7, 2026

Description

This change introduces the mvcombine PPL command, which combines values of a specified field across rows that are identical on all other fields in the current result set.

mvcombine behaves as a pipeline-level grouping operator: it collapses matching rows into a single row and aggregates the target field values into a multivalue field, while preserving all non-target fields.

Command
mvcombine [delim=""]

Arguments

field (required)
The field whose values will be combined.

  • Must be a direct field reference.
  • Must be a single-valued (scalar) field.
  • If the field does not exist in the current schema, the command fails.

delim (optional)
Delimiter to be used when rendering a single-value representation of the combined field.

  • Defaults to a single space (" "), consistent with Splunk.
  • Accepted for forward compatibility.
  • Has no observable effect in this implementation, since nomv is not yet supported.

Semantics

mvcombine operates on the current result set.

Rows are grouped where all fields except have identical values.

For each group:

  • All non-target fields are preserved.
  • The target becomes a multivalue array containing the combined values.
  • Grouping is not limited to adjacent rows; all matching rows in the result set are combined.
  • Value order is preserved from the input stream order at the point where mvcombine executes.
  • Users must apply sort before mvcombine if deterministic ordering is required.
  • Rows missing the target field do not contribute a value to the combined output.
  • If the target field does not exist or is not a direct field reference, the command fails with an error.

Scope clarification

  • This PR implements multivalue output only, which is the default and primary behavior in Splunk.
  • nomv support (single-value string output using delim) is out of scope for this change and will be introduced separately.
  • The delim argument is parsed and validated for syntax parity and forward compatibility, but does not affect output in this implementation.

Related Issues

Resolves #4766
#4766

Check List

  • [ X] New functionality includes testing.
  • [ X] New functionality has been documented.
  • [ X] New functionality has javadoc added.
  • [ X] New functionality has a user manual doc added.
  • [ X] New PPL command checklist all confirmed.
  • [ X] API changes companion pull request created.
  • [ X] Commits are signed per the DCO using --signoff or -s.
  • [ X] Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Srikanth Padakanti and others added 7 commits January 6, 2026 17:33
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Copy link
Collaborator

@yuancu yuancu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing! I left a few comments.

Signed-off-by: Srikanth Padakanti <[email protected]>
@srikanthpadakanti
Copy link
Contributor Author

Addressed the comments and tested locally to verify the results.
Please take a look and review this before the code-freeze.
@ykmr1224 @penghuo @yuancu @LantaoJin @anasalkouz @dai-chen

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`:
- Around line 3353-3360: The metadata NULL projection is using the
post-aggregate schema from relBuilder.peek() which misindexes fields; in
CalciteRelNodeVisitor, capture and store the original input field types at the
start of visitMvCombine (e.g., a List of RelDataType or Type information from
the input row type) and pass that into restoreColumnOrderAfterArrayAgg; then
modify restoreColumnOrderAfterArrayAgg so when isMetadataField(fieldName) is
true it looks up the type from the saved original input types (instead of
relBuilder.peek().getRowType().getFieldList().get(i).getType()) and uses that
type for makeNullLiteral, ensuring projections are built from the original
schema.

In `@docs/user/ppl/cmd/mvcombine.md`:
- Line 60: Update the documentation descriptions so the dataset name matches the
queries by replacing references to "mvcombine" with "mvcombine_data" in the text
surrounding the examples (the occurrences around the lines that currently say
"Given a dataset mvcombine"), ensuring both places (the one at line ~60 and the
one at line ~92) use "mvcombine_data" to match the query source names.
- Line 117: Rename the heading text "Example 5: Error when field does not exist"
to "Example 4: Error when field does not exist" so example numbering is
sequential; locate the heading string in the mvcombine docs (the line containing
"Example 5: Error when field does not exist") and update it, then scan
subsequent example headings to ensure no other numbering gaps or off-by-one
issues.
- Line 58: The "Example 2" line is missing the Markdown header prefix and the
example numbering jumps (Example 3 -> Example 5); update the "Example 2" line to
include the same header style used for other examples (add "## " before "Example
2") and renumber the subsequent examples to maintain a sequential order (adjust
"Example 5" to "Example 4" or otherwise fix numbering so Example 1, Example 2,
Example 3, Example 4, etc.), referencing the visible example headings "Example
1", "Example 2", "Example 3", and "Example 5" to locate the spots to change.

In
`@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java`:
- Around line 195-203: Add missing null/boundary/invalid-input tests for the
mvcombine command: extend CalciteMvCombineCommandIT by adding new test methods
alongside testMvCombine_missingField_shouldReturn4xx that call executeQuery with
cases for null/empty input (e.g., "mvcombine " with no field or explicit null
values), boundary/empty collections, non-direct field references (e.g., function
or expression instead of field name) and already-multivalue fields; assert
proper 4xx/validation responses or expected behavior via ResponseException
status checks and positive assertions where applicable; use the existing pattern
in testMvCombine_missingField_shouldReturn4xx and reference executeQuery and
mvcombine to locate where to add tests.
- Around line 122-149: The current test
testMvCombine_multipleGroups_producesOneRowPerGroupKey uses loose OR checks on
mv0/mv1 which can hide grouping regressions; tighten it by asserting the sorted
row IPs explicitly (inspect result -> datarows -> each row's ip column) to
ensure row 0 is "10.0.0.7" and row 1 is "10.0.0.8", then validate the mv
contents per group deterministically (check mv0 contains the expected set for
10.0.0.7 and mv1 contains the expected set for 10.0.0.8) instead of the current
OR-based assertions—update references in this test to use the ip values from
r0.get(0)/r1.get(0) and assert exact membership of mv0/mv1 accordingly.

Srikanth Padakanti added 4 commits January 27, 2026 12:38
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Srikanth Padakanti and others added 3 commits January 27, 2026 17:34
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
@srikanthpadakanti
Copy link
Contributor Author

@ykmr1224 @anasalkouz
I’m trying to get this change merged into the 3.5 release. Could you please run the workflows and let me know if you have any comments or concerns on the code?

Copy link
Collaborator

@yuancu yuancu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change!

@LantaoJin LantaoJin merged commit 92e73ea into opensearch-project:main Jan 28, 2026
37 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-5025-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 92e73ead7c05c97668510de5cae246bdfdcff394
# Push it to GitHub
git push --set-upstream origin backport/backport-5025-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-5025-to-2.19-dev.

@LantaoJin
Copy link
Member

Thanks for your contribution, merged to main

asifabashar pushed a commit to asifabashar/sql that referenced this pull request Jan 28, 2026
* MvCombine Command Feature

Signed-off-by: Srikanth Padakanti <[email protected]>

* MvCombine Command Feature

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add doctests to MvCombine

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlesscheck apply

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlesscheck apply

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlesscheck apply

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlessapply

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add mvcombine to index.md

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* complete the checklist from ppl-commands.md

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlessApply

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add visitMvCombine method to the FieldResolutionVisitor

Signed-off-by: Srikanth Padakanti <[email protected]>

* Apply spotlesscheck

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add changes to exclude the metadata fields and remove the CAST logic

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address CrossClusterSearchIT comment

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address CrossClusterSearchIT comment

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address CrossClusterSearchIT comment

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address comments

Signed-off-by: Srikanth Padakanti <[email protected]>

---------

Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Co-authored-by: Srikanth Padakanti <[email protected]>
# Conflicts:
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
#	ppl/src/main/antlr/OpenSearchPPLParser.g4
asifabashar pushed a commit to asifabashar/sql that referenced this pull request Jan 28, 2026
* MvCombine Command Feature

Signed-off-by: Srikanth Padakanti <[email protected]>

* MvCombine Command Feature

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add doctests to MvCombine

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlesscheck apply

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlesscheck apply

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlesscheck apply

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlessapply

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address coderrabbit comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add mvcombine to index.md

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* Remove the nomv related implementation as that command is still not yet implemented

Signed-off-by: Srikanth Padakanti <[email protected]>

* complete the checklist from ppl-commands.md

Signed-off-by: Srikanth Padakanti <[email protected]>

* spotlessApply

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add visitMvCombine method to the FieldResolutionVisitor

Signed-off-by: Srikanth Padakanti <[email protected]>

* Apply spotlesscheck

Signed-off-by: Srikanth Padakanti <[email protected]>

* Add changes to exclude the metadata fields and remove the CAST logic

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address CrossClusterSearchIT comment

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address CrossClusterSearchIT comment

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address CrossClusterSearchIT comment

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Coderrabbit issues

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address comments

Signed-off-by: Srikanth Padakanti <[email protected]>

* Address comments

Signed-off-by: Srikanth Padakanti <[email protected]>

---------

Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Co-authored-by: Srikanth Padakanti <[email protected]>
# Conflicts:
#	integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
#	ppl/src/main/antlr/OpenSearchPPLParser.g4
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add mvcombine command in PPL

4 participants